10 research outputs found

    Bridging the Domain Gap for Stance Detection for the Zulu language

    Misinformation has become a major concern in recent years given its spread across our information sources. In the past years, many NLP tasks have been introduced in this area, with some systems reaching good results on English-language datasets. Existing AI-based approaches for fighting misinformation in the literature suggest automatic stance detection as an integral first step to success. Our paper aims to utilize the progress made for English and transfer that knowledge to other languages, which is a non-trivial task due to the domain gap between English and the target languages. We propose a black-box, non-intrusive method that utilizes techniques from Domain Adaptation to reduce the domain gap, without requiring any human expertise in the target language, by leveraging low-quality data in both a supervised and unsupervised manner. This allows us to rapidly achieve results for stance detection in the Zulu language, the target language in this work, similar to those found for English. We also provide a stance detection dataset in the Zulu language. Our experimental results show that by leveraging English datasets and machine translation we can increase performance on both English data and other languages. Comment: accepted to Intellisy

    Abstraction-Based Outlier Detection for Image Data

    © 2021, Springer Nature Switzerland AG. Data plays an important role in all stages of training and using machine learning algorithms. Outliers are samples in the data that are generated by a “different mechanism” and belong to unexpected patterns that do not conform to normal behaviour. Outlier detection techniques try to deal with such undesirable events. Deep learning has had exceptional success over classical methods in computer vision. In recent years, a number of works have employed the representation learning ability of deep autoencoders or Generative Adversarial Networks for outlier detection. These methods either plug representation techniques into outlier detection methods or directly use the reconstruction error as an outlier score. However, the error distributions of inliers and outliers may still overlap significantly. This can be caused by variation among samples within a class, by high outlier ratios, and so on. In these cases, simply thresholding reconstruction errors may lead to misclassification. Although the produced representation may be effective in representing the common features of the normal data, it is not necessarily effective in distinguishing outliers from inliers. We present a method that constructs new features using a convolutional variational autoencoder (VAE) and generates abstractions based on these features. To detect anomalies, we tested two scenarios: utilizing the VAE itself, and using the abstractions to train an additional architecture. Results are reported as AUC-ROC on four benchmark datasets.
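
    The point about overlapping error distributions can be illustrated with a minimal sketch. It is not the authors' method: the reconstruction errors, labels, threshold, and function names below are hypothetical values chosen for illustration. It computes AUC-ROC as the fraction of (outlier, inlier) pairs in which the outlier has the higher score, and shows how a single threshold misclassifies samples once the two distributions overlap.

```python
def auc_roc(scores, labels):
    """Rank-based AUC-ROC: probability that a random outlier (label 1)
    receives a higher score than a random inlier (label 0); ties count 0.5."""
    outliers = [s for s, y in zip(scores, labels) if y == 1]
    inliers = [s for s, y in zip(scores, labels) if y == 0]
    if not outliers or not inliers:
        raise ValueError("need at least one inlier and one outlier")
    wins = 0.0
    for o in outliers:
        for i in inliers:
            if o > i:
                wins += 1.0
            elif o == i:
                wins += 0.5
    return wins / (len(outliers) * len(inliers))


def threshold_predictions(scores, threshold):
    """Flag a sample as an outlier (1) when its score exceeds the threshold."""
    return [1 if s > threshold else 0 for s in scores]


# Hypothetical reconstruction errors: one inlier (0.5) scores above one
# outlier (0.4), so the two error distributions overlap.
errors = [0.1, 0.5, 0.4, 0.9]
labels = [0, 0, 1, 1]

auc = auc_roc(errors, labels)
predictions = threshold_predictions(errors, threshold=0.45)
mistakes = sum(p != y for p, y in zip(predictions, labels))
```

    With these made-up numbers, three of the four outlier/inlier pairs are ranked correctly, and the threshold of 0.45 mislabels the two samples that fall in the overlap region.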

    Domain Adaptation for Car Accident Detection in Videos

    © 2019 IEEE. In this paper, we implement a deep learning model for car accident detection using synthetic videos while adapting the model, using domain adaptation (DA), to real videos from CCTV traffic cameras. The synthetic data are rendered using a video game. The reason for using such data is the lack of real videos of car crashes from CCTV. Though a video game may allow us to generate car crashes in a variety of scenarios, the difference between synthetic and real videos can negatively affect the model's performance. Accordingly, our aim is three-fold: render numerous synthetic videos with significant variations, train a 3D CNN-based deep model on the collected videos, and use DA to adapt the model from synthetic to real videos. Our experimental results, obtained under a variety of experimental setups, demonstrate the feasibility of using our approach for car accident detection in real videos.

    Knowledge Graph Embedding-Based Domain Adaptation for Musical Instrument Recognition

    Convolutional neural networks have raised the bar for machine learning and artificial intelligence applications, mainly due to the abundance of data and computation. However, there is not always enough data for training, especially when it comes to historical collections of cultural heritage where the original artworks have been destroyed or damaged over time. Transfer learning and domain adaptation techniques are possible solutions to the issue of data scarcity. This article presents a new method for domain adaptation based on knowledge graph embeddings. A knowledge graph embedding is a projection of a knowledge graph into a lower-dimensional space where entities and relations are represented in continuous vector spaces. Our method incorporates these semantic vector spaces as a key ingredient to guide the domain adaptation process. We combined knowledge graph embeddings with visual embeddings from the images and trained a neural network with the combined embeddings as anchors using an extension of Fisher’s linear discriminant. We evaluated our approach on two cultural heritage datasets of images containing medieval and renaissance musical instruments. The experimental results showed a significant improvement over the baselines and state-of-the-art performance compared with other domain adaptation methods.

    Few-Shot Object Detection: Application to Medieval Musicological Studies

    Detecting objects with a small representation in images is a challenging task, especially when the style of the images is very different from recent photos, which is the case for cultural heritage datasets. This problem is commonly known as few-shot object detection and is still a new field of research. This article presents a simple and effective method for black-box few-shot object detection that works with all the current state-of-the-art object detection models. We also present a new dataset called MMSD for medieval musicological studies that contains five classes and 693 samples, manually annotated by a group of musicology experts. Due to the significant diversity of styles and considerable disparities between the artistic representations of the objects, our dataset is more challenging than the current standards. We evaluate our method on YOLOv4 (m/s), (Mask/Faster) RCNN, and ViT/Swin-t. We present two methods of benchmarking these models based on the overall data size and the worst-case scenario for object detection. The experimental results show that our method always improves object detector results compared to traditional transfer learning, regardless of the underlying architecture.

    Triplet Loss Network for Unsupervised Domain Adaptation

    Domain adaptation is a sub-field of transfer learning that aims at bridging the dissimilarity gap between different domains by transferring and re-using the knowledge obtained in the source domain in the target domain. Many methods have been proposed to resolve this problem, using techniques such as generative adversarial networks (GAN), but the complexity of such methods makes it hard to use them in different problems, as fine-tuning such networks is usually a time-consuming task. In this paper, we propose a method for unsupervised domain adaptation that is both simple and effective. Our model (referred to as TripNet) harnesses the idea of a discriminator and Linear Discriminant Analysis (LDA) to push the encoder to generate domain-invariant features that are category-informative. At the same time, pseudo-labelling is used for the target data to train the classifier and to bring the same classes from both domains together. We evaluate TripNet against several existing state-of-the-art methods on three image classification tasks: digit classification (MNIST, SVHN, and USPS datasets), object recognition (Office31 dataset), and traffic sign recognition (GTSRB and Synthetic Signs datasets). Our experimental results demonstrate that (i) TripNet beats almost all existing methods of similar simplicity on all of these tasks; and (ii) it even beats, in some cases, models that are significantly more complex (or harder to train) than TripNet. Hence, the results confirm the effectiveness of using TripNet for unsupervised domain adaptation in image classification.
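
    The abstract does not spell out the loss that gives TripNet its name, so the following is only a sketch of the standard triplet margin loss, assuming squared Euclidean distance and a margin of 1.0. The function names and feature vectors are hypothetical and are not taken from the paper. The loss is zero once an anchor is closer to a same-class (positive) sample than to a different-class (negative) sample by at least the margin, which is the general mechanism for pulling same-class features together.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin)."""
    d_pos = squared_distance(anchor, positive)
    d_neg = squared_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D features: the positive is near the anchor and the
# negative is far away, so the margin is already satisfied.
satisfied = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])

# Swapping the roles violates the margin and yields a positive loss.
violated = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

    In an unsupervised domain adaptation setting, the positive and negative samples for target data would have to come from pseudo-labels, since true target labels are unavailable.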

    Adversarial Reconstruction Loss for Domain Generalization

    The biggest concern when deploying machine learning models to the real world is whether they can handle new data. This problem is especially significant in medicine, where models trained on rich, high-quality data extracted from large hospitals do not scale to small regional hospitals. One of the clinical challenges addressed in this work is magnetic resonance image generalization for improved visualization and diagnosis of hip abnormalities such as femoroacetabular impingement and dysplasia. Domain Generalization (DG) is a field in machine learning that tries to solve the model's dependency on the training data by leveraging many related but different data sources. We present a new method for DG that is both efficient and fast, unlike most current state-of-the-art methods, which add a substantial computational burden, making them hard to fine-tune. Our model trains an autoencoder setting on top of the classifier, but the encoder is trained on the adversarial reconstruction loss, forcing it to forget style information while extracting features useful for classification. Our approach aims to force the encoder to generate domain-invariant representations that are still category-informative by pushing it in both directions. Our method has proven universal and was validated on four different benchmarks for domain generalization, outperforming the state of the art on RMNIST, VLCS and IXMAS with a 0.70% increase in accuracy and providing comparable results on PACS with a 0.02% difference. Our method was also evaluated for unsupervised domain adaptation and was shown to be quite effective against over-fitting.
